NSF PAR Search | NSF Public Access Repository


All-FIT: Allele-Frequency-based Imputation of Tumor Purity from High-Depth Sequencing Data

https://doi.org/10.1093/bioinformatics/btz865

Loh, Jui Wan; Guccione, Caitlin; Di Clemente, Frances; Riedlinger, Gregory; Ganesan, Shridar; Khiabanian, Hossein; Hancock, John (November 2019, Bioinformatics)

Abstract
Motivation: Clinical sequencing aims to identify somatic mutations in cancer cells for accurate diagnosis and treatment. However, most widely used clinical assays lack patient-matched control DNA, and additional analysis is needed to distinguish somatic from unfiltered germline variants. Such computational analyses require accurate assessment of tumor cell content in individual specimens. Histological estimates often do not agree with results from computational methods that are primarily designed for normal-tumor matched data and can be confounded by genomic heterogeneity and the presence of sub-clonal mutations.
Methods: All-FIT is an iterative weighted least squares method to estimate specimen tumor purity based on the allele frequencies of variants detected in high-depth, targeted, clinical sequencing data.
Results: Using simulated and clinical data, we demonstrate All-FIT’s accuracy and improved performance against leading computational approaches, highlighting the importance of interpreting purity estimates based on the expected biology of tumors.
Availability and implementation: Freely available at http://software.khiabanian-lab.org.
Supplementary information: Supplementary data are available at Bioinformatics online.
Full Text Available
DeepIsoFun: a deep domain adaptation approach to predict isoform functions

https://doi.org/10.1093/bioinformatics/bty1017

Shaw, Dipan; Chen, Hao; Jiang, Tao; Hancock, John (December 2018, Bioinformatics)

Abstract
Motivation: Isoforms are mRNAs produced from the same gene locus by alternative splicing and may have different functions. Although gene functions have been studied extensively, little is known about the specific functions of isoforms. Recently, some computational approaches based on multiple instance learning have been proposed to predict isoform functions from annotated gene functions and expression data, but their performance is far from desirable, primarily due to the lack of labeled training data. To improve performance on this problem, we propose a novel deep learning method, DeepIsoFun, that combines multiple instance learning with domain adaptation. The latter technique helps to transfer the knowledge of gene functions to the prediction of isoform functions and provides additional labeled training data. Our model is trained on a deep neural network architecture so that it can adapt to different expression distributions associated with different gene ontology terms.
Results: We evaluated the performance of DeepIsoFun on three expression datasets of human and mouse collected from SRA studies at different times. On each dataset, DeepIsoFun performed significantly better than the existing methods. In terms of area under the receiver operating characteristic curve, our method achieved at least a 26% improvement, and in terms of area under the precision-recall curve, it achieved at least a 10% improvement over the state-of-the-art methods. In addition, we study the divergence of the functions predicted by our method for isoforms from the same gene and the overall correlation between expression similarity and the similarity of predicted functions.
Availability and implementation: https://github.com/dls03/DeepIsoFun/
Supplementary information: Supplementary data are available at Bioinformatics online.
Full Text Available
Statistical tests for detecting variance effects in quantitative trait studies

https://doi.org/10.1093/bioinformatics/bty565

Dumitrascu, Bianca; Darnell, Gregory; Ayroles, Julien; Engelhardt, Barbara E; Hancock, John (July 2018, Bioinformatics)

Full Text Available
An accurate and powerful method for copy number variation detection

https://doi.org/10.1093/bioinformatics/bty1041

Xiao, Feifei; Luo, Xizhi; Hao, Ning; Niu, Yue S; Xiao, Xiangjun; Cai, Guoshuai; Amos, Christopher I; Zhang, Heping; Hancock, John (January 2019, Bioinformatics)

Full Text Available
